Data from: Lopez-Yrigoyen M, Yang CT, Fidanza A, Cassetta L et al. Genetic programming of macrophages generates an in vitro model for the human erythroid island niche. Nat Commun 2019 Feb 20;10(1):881.

Analysis from A1

Prepare packages required for this analysis.

# Load Biobase.
if (! require(Biobase, quietly=TRUE)) {
  if (! exists("biocLite")) {
    source("https://bioconductor.org/biocLite.R")
  }
  biocLite("Biobase")
  library(Biobase)
}
# Load GEOquery.

if (! require(GEOquery, quietly=TRUE)) {
  if (! exists("biocLite")) {
    source("https://bioconductor.org/biocLite.R")
  }
  biocLite("GEOquery")
  library(GEOquery)
}

library(knitr)
library(edgeR)

Retrieve GEO data for GSE125150.

# GSE125150 <- getGEO("GSE125150", GSEMatrix = FALSE)
GSE125150 <- getGEO(filename = "~/projects/GSE125150_family.soft", GSEMatrix = FALSE)
## Reading file....
## Parsing....
## Found 9 entities...
## GPL20301 (1 of 10 entities)
## GSM3564285 (2 of 10 entities)
## GSM3564286 (3 of 10 entities)
## GSM3564287 (4 of 10 entities)
## GSM3564288 (5 of 10 entities)
## GSM3564289 (6 of 10 entities)
## GSM3564290 (7 of 10 entities)
## GSM3564291 (8 of 10 entities)
## GSM3564292 (9 of 10 entities)
kable(data.frame(head(Meta(GSE125150))), format = "html")
contact_address contact_city contact_country contact_email contact_institute contact_laboratory
5 Little France Dr Edinburgh United Kingdom Centre for Regenerative Medicine Forrester
gpl <- names(GPLList(GSE125150))[1]
# gpl_info <- Meta(getGEO(gpl))

Retrieve supplementary files for the differential expressions.

gene_id source gene_version gene_name gene_biotype phase K2MT_A K2MT_B K2MT_C
ENSG00000000003 ENSG00000000003 ensembl 14 TSPAN6 protein_coding NA 557 384 416
ENSG00000000005 ENSG00000000005 ensembl_havana 5 TNMD protein_coding NA 0 0 0
ENSG00000000419 ENSG00000000419 ensembl_havana 12 DPM1 protein_coding NA 1052 1082 1093
ENSG00000000457 ENSG00000000457 ensembl_havana 13 SCYL3 protein_coding NA 483 543 627
ENSG00000000460 ENSG00000000460 havana 16 C1orf112 protein_coding NA 278 295 398
ENSG00000000938 ENSG00000000938 ensembl_havana 12 FGR protein_coding NA 7019 5414 5931
ENSG00000000971 ENSG00000000971 ensembl_havana 15 CFH protein_coding NA 91 52 243
ENSG00000001036 ENSG00000001036 ensembl_havana 13 FUCA2 protein_coding NA 5648 4683 6030
ENSG00000001084 ENSG00000001084 havana 10 GCLC protein_coding NA 20047 18149 23182
ENSG00000001167 ENSG00000001167 ensembl_havana 14 NFYA protein_coding NA 2588 2366 2766
ENSG00000001460 ENSG00000001460 ensembl_havana 17 STPG1 protein_coding NA 137 76 83
ENSG00000001461 ENSG00000001461 ensembl_havana 16 NIPAL3 protein_coding NA 1025 562 730
ENSG00000001497 ENSG00000001497 ensembl_havana 16 LAS1L protein_coding NA 1155 974 1178
ENSG00000001561 ENSG00000001561 ensembl_havana 6 ENPP4 protein_coding NA 122 97 128
ENSG00000001617 ENSG00000001617 havana 11 SEMA3F protein_coding NA 9 4 30

Clean data if needed.

## named integer(0)

The data provided was thorough enough to have HGNC symbols for each gene, so there was no need for identifier mapping.

Filter any uninformative or weakly expressed data after converting counts to cpm.

## [1] 60675    15

Started with 60675 genes. Filtered down to 13872 genes.

Plotting the density distribution of my filtered and cpm-converted data.

Plot and compare the distribution between pre- and post-normalization.

Save the data.

# Convert back to data.frame so that we can associate gene names with the counts again, because
# matrices cannot hold strings and numerics together.

a <- as.data.frame(normalized_counts)
b <- c("gene_id", colnames(a))
normalized_counts <- merge(expr_filtered[c(2, 5)], cbind(rownames(a), a), by.x="gene_id", by.y="rownames(a)")

write.table(normalized_counts, "~/GSE125150_normalized_counts.txt", sep = "\t")

Analysis for A2

Introduction

In the previous assignment, data was retrieved from GEO with the id GSE125150. This data is explores how KLF-1 expression affects macrophage function during the maturation of erythrocyte, in humans. After the raw gene count data underwent conversion to cpm, any uninformative or weakly expressed genes were omitted. Then, the cleaned data was normalized using log2(cpm) and further analysis with MDS plot and mean-variance relationship showed that the samples clustered based on treatment (KLF-1+ vs. KLF-1-) and that the data followed a negative binomial distribution. After cleaning and normalization, the dataset was reduced to 13872 genes starting from 60675.

Differential Expression Analysis

Prepare packages required for this analysis.

# Load Biobase.
if (! require(Biobase, quietly=TRUE)) {
  if (! exists("biocLite")) {
    source("https://bioconductor.org/biocLite.R")
  }
  biocLite("Biobase")
  library(Biobase)
}
# Load GEOquery.

if (! require(GEOquery, quietly=TRUE)) {
  if (! exists("biocLite")) {
    source("https://bioconductor.org/biocLite.R")
  }
  biocLite("GEOquery")
  library(GEOquery)
}

library(knitr)
library(edgeR)

Load normalized data from previous assignment.

normalized_count_data <- read.table(file=file.path("~/GSE125150_normalized_counts.txt"))
# Check if data imported correctly:
normalized_count_data[1:5,1:5]

Prepare for creating a heatmap and downstream analysis.

# Converting to matrix, because we have a data.frame:
heatmap_matrix <- normalized_count_data[,3:ncol(normalized_count_data)]
rownames(heatmap_matrix) <- normalized_count_data$gene_id
colnames(heatmap_matrix) <- colnames(normalized_count_data[,3:ncol(normalized_count_data)])
heatmap_matrix
library(ComplexHeatmap)
library(circlize)

# Create a heat map
if(min(heatmap_matrix) == 0){
    heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
  } else {
    heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
  }
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
                               show_row_dend = TRUE,
                               show_column_dend = TRUE, 
                               col=heatmap_col,
                               show_column_names = TRUE, 
                               show_row_names = FALSE,
                               show_heatmap_legend = TRUE
                               )


# Apply row-normalization
# Subtract the mean from each value and divide by the standard dev of the row to get row-normalization
# (scale() function does this for us).
heatmap_matrix <- t(scale(t(heatmap_matrix)))
if(min(heatmap_matrix) == 0){
    heatmap_col = colorRamp2(c( 0, max(heatmap_matrix)), c( "white", "red"))
  } else {
    heatmap_col = colorRamp2(c(min(heatmap_matrix), 0, max(heatmap_matrix)), c("blue", "white", "red"))
  }
current_heatmap <- Heatmap(as.matrix(heatmap_matrix),
                               show_row_dend = TRUE,
                               show_column_dend = TRUE, 
                               col=heatmap_col,
                               show_column_names = TRUE, 
                               show_row_names = FALSE,
                               show_heatmap_legend = TRUE
                               )

# Roughly, our data clusters based on treatment (KLF-1+ or KLF-1-), also shown by the MDS plot in A1.
plotMDS(heatmap_matrix, labels = rownames(samples), col = c("darkgreen", "blue")[factor(samples$Condition)])

Create a model, calculate p-values, and apply multiple hypothesis testing.

# Define groups:
samples <- data.frame(
        lapply(colnames(normalized_count_data)[3:10],
               FUN=function(x){
                 unlist(strsplit(x, split="_"))[c(2,1)]}))
colnames(samples) <- colnames(normalized_count_data)[3:10]
rownames(samples) <- c("sample", "treatment")
samples <- data.frame(t(samples))
samples
# Create data matrix
model_design <- model.matrix(~ samples$treatment)
kable(model_design, type="html")
(Intercept) samples$treatmentK2PT
1 0
1 0
1 0
1 0
1 1
1 1
1 1
1 1
expression_matrix <- as.matrix(normalized_count_data[,3:10])
rownames(expression_matrix) <- normalized_count_data$gene_id
colnames(expression_matrix) <- colnames(normalized_count_data)[3:10]
minimal_set <- ExpressionSet(assayData=expression_matrix)

# Fit the data to our linear model
fit <- lmFit(minimal_set, model_design)

# Apply empirical Bayes to compute differential expressions for the model
fit2 <- eBayes(fit,trend=TRUE)

topfit <- topTable(fit2, 
                   coef=ncol(model_design),
                   adjust.method = "BH",
                   number = nrow(expression_matrix))

# Merge gene names to topfit table
output_hits <- merge(normalized_count_data[,1:2],
                     topfit,
                     by.y=0,by.x=1,
                     all.y=TRUE)
# Sort by pvalue
output_hits <- output_hits[order(output_hits$P.Value),]
output_hits
# Check the results
kable(output_hits[1:10,],type="html")
gen e_id gen e_name logFC AveExpr t P. Value ad j.P.Val B
5581 ENSG00000136999 NOV 34.887889 21.792339 26.50301 0 2.35e-05 1.839216
6348 ENSG00000143537 ADAM15 200.047348 182.182205 26.17249 0 2.35e-05 1.834049
824 ENSG00000068078 FGFR3 6.509106 4.371374 24.02113 0 2.35e-05 1.795317
5062 ENSG00000133101 CCNA1 -16.752138 10.358061 -23.54767 0 2.35e-05 1.785418
5416 ENSG00000135931 ARMC9 -109.559602 91.409905 -23.49629 0 2.35e-05 1.784309
2556 ENSG00000106123 EPHB6 38.446298 25.271772 23.13532 0 2.35e-05 1.776325
2957 ENSG00000110651 CD81 -320.335578 811.888969 -20.13071 0 6.12e-05 1.693589
8878 ENSG00000167779 IGFBP6 24.459781 15.279415 19.77006 0 6.12e-05 1.681245
1515 ENSG00000090776 EFNB1 86.553752 84.079504 19.51893 0 6.12e-05 1.672275
4625 ENSG00000128918 ALDH1A2 -169.172552 150.016291 -19.11615 0 6.51e-05 1.657203
length(which(output_hits$P.Value < 0.01))
## [1] 2841
length(which(output_hits$adj.P.Val < 0.01))
## [1] 1337

For Multiple Hypothesis Testing, I used the BH method. According to the documentation and literature, BH method controls for the false discovery rate (a less stringent condition than family-wise error rate) and therefore is a more powerful method than those that control for family-wise error rate. My dataset allows many genes to pass even with the stringent condition (with BH), so I decided to use BH. Before correction, 2841 genes passed a cutoff of 0.01, and after correction with the BH method, 1337 genes passed. With a cutoff of 0.05, there were 2859 (after correction) and I considered this as excessive, so I decided to make the cutoff more strict. I could have made the cutoff higher to 0.001; however, I was afraid I would disregard interesting genes, so I kept the cutoff at 0.01 as it is a generally accepted p-value for high significance.

Create heat map based on p-value cutoff.

top_hits <- output_hits$gene_id[output_hits$adj.P.Val<0.01]
top_hits
heatmap_matrix_tophits <- t(
  scale(t(heatmap_matrix[
    which(rownames(heatmap_matrix) %in% top_hits),])))
if(min(heatmap_matrix_tophits) == 0){
    heatmap_col = colorRamp2(c( 0, max(heatmap_matrix_tophits)), 
                             c( "white", "red"))
  } else {
    heatmap_col = colorRamp2(c(min(heatmap_matrix_tophits), 0, max(heatmap_matrix_tophits)), c("blue", "white", "red"))
  }
current_heatmap <- Heatmap(as.matrix(heatmap_matrix_tophits),
                           cluster_rows = TRUE,
                           cluster_columns = TRUE,
                               show_row_dend = TRUE,
                               show_column_dend = TRUE, 
                               col=heatmap_col,
                               show_column_names = TRUE, 
                               show_row_names = FALSE,
                               show_heatmap_legend = TRUE,
                               )
current_heatmap

Based on the heatmap, the conditions (KLF-1+ vs. KLF-1-) cluster together. All the samples in KLF-1+ (K2PT) share similar differential expression profiles and all the samples in KLF-1- (K2MT) share similar differential expression profiles.

ORA

Retrieve thresholded list for upregulated and downregulated genes

output_hits[,"rank"] <- -log(output_hits$adj.P.Val,base =10) * sign(output_hits$logFC)
output_hits <- output_hits[order(output_hits$rank),]

upregulated_genes <- output_hits$gene_name[
  which(output_hits$adj.P.Val < 0.01 
             & output_hits$logFC > 0)]

downregulated_genes <- output_hits$gene_name[
  which(output_hits$adj.P.Val < 0.01 
             & output_hits$logFC < 0)]

upregulated_genes[1:10]
##  [1] B4GALT6       MAP7D1        AKAP12        CLTCL1        ABLIM1       
##  [6] IL10          NDUFAF5       RP11-550P17.5 TRGV6         DNM1L        
## 13869 Levels: A1BG-AS1 A2M A2M-AS1 AAAS AACS AADAC AADAT AAED1 AAGAB ... ZZZ3
downregulated_genes[1:10]
##  [1] CCNA1   ARMC9   CD81    ALDH1A2 ID2     MAN1A1  PLP2    CTNNB1 
##  [9] H6PD    HS3ST1 
## 13869 Levels: A1BG-AS1 A2M A2M-AS1 AAAS AACS AADAC AADAT AAED1 AAGAB ... ZZZ3
write.table(x=upregulated_genes,
            file=file.path("KLF1_upregulated_genes.txt"),sep = "\t",
            row.names = FALSE,col.names = FALSE,quote = FALSE)

write.table(x=downregulated_genes,
            file=file.path("KLF1_downregulated_genes.txt"),sep = "\t",
            row.names = FALSE,col.names = FALSE,quote = FALSE)

ORA with g:Profiler

The extracted genes were put through g:Profiler for ORA. g:Profiler was used because of its easy manipulation. The following options were used. Significance threshold: Benjamini-Hochberg (Again, because of its powerfulness due to its stringence.) User threshold: 0.05 Annotation sources: GO molecular function, GO biological process, Reactome, WikiPathways (These sources were used because we want to know the mechanism of action behind the effect of KLF1+ macrophages on the maturation of erythrocytes, and what kinds of pathways may be involved for this maturation to occur. Other sources, like GO cellular component, were not relevant because we are interested in a broader picture on the pathways and processes)

Below are the images of related terms for the upregulated genes:

knitr::include_graphics("upregulated_genes_1.png")

knitr::include_graphics("upregulated_genes_2.png")

knitr::include_graphics("upregulated_genes_3.png")

knitr::include_graphics("upregulated_genes_4.png")

knitr::include_graphics("downregulated_genes_1.png")

knitr::include_graphics("downregulated_genes_2.png")

knitr::include_graphics("downregulated_genes_3.png")

Interpretations

The results of this analysis broadly supports the original paper’s conclusion, in that KLF-1 induction does lead to differentiation to macrophages that are involved in erythrocyte maturation. The result of g:Profiler shows that upregulated genes are involved in immune system processes, as well as cell-cell adhesion and communication. Also, protein kinase activity seems to be upregulated, and this makes sense because kinases are often related with signalling pathways and activations. Interestingly, these genes are also involved in neuronal processes which could potentially unveil interesting relationships between immune cells that are resident in umbilical cords vs. near neurons. Furthermore, the downregulated genes appear to be involved in kinase inhibition, which corresponds to the upregulated genes. One interesting process that is shown to be downregulated in KLF-1+ macrophages is the regulation of CD4 T cells. This could possibly be due to the fact that these macrophages are obligates to helping the maturation of erythrocytes.

Gene Set Enrichment Analysis

Extract gene list and ranks for .rnk file

# Filter to only significant genes
output_hits_filtered <- (subset(output_hits, 
                           output_hits$adj.P.Val < 0.01))

# Flip order because rank is in ascending order
output_hits_desc <- output_hits_filtered[nrow(output_hits_filtered):1, ]

# Write only the second(gene name) and ninth(rank) columns. 
write.table(x=output_hits_desc[c(2,9)],
            file=file.path("KLF1_genelist.rnk"),sep = "\t",
            row.names = FALSE,col.names = FALSE,quote = FALSE)

# Top 10 genes and ranks.
kable((output_hits_desc[c(2,9)])[1:10,], format = "html")
gene_name rank
2556 EPHB6 4.628656
824 FGFR3 4.628656
6348 ADAM15 4.628656
5581 NOV 4.628656
1515 EFNB1 4.213202
8878 IGFBP6 4.213202
9648 C1QB 3.898630
2806 ABI3 3.898630
11590 GRK6 3.883293
258 DNASE1L1 3.883293

Run GSEA

GSEA 4.0.3 was used. Geneset was retrieved from http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/. Because we want the latest GO biological processes and do not want anything inferred from IEA, Human_GOBP_AllPathways_no_GO_iea_April_01_2020_symbol.gmt was used. The following parameters were used: - No collapse - Max size 200 - Min size 15 - Everything else as default.

Below are the top 10 enriched gene sets from upregulated and downregulated genes, respectively.

knitr::include_graphics("GSEA_top10_pos.png")

knitr::include_graphics("GSEA_top10_neg.png")

Based on these lists, the upregulated genes are enriched in neuron projection guidance, which suggests that the genes are involved in neuron development. Furthermore, other top gene sets are also involved in CNS development, such as axon development. These are known results from previous research, for example by Moore, Apara and Goldberg (2011), where KLF genes are transcription factors involved in neuron growth and axon regeneration. Furthermore, Bieker, Ouyang and Chen (1998) also state that among the KLF genes, post-translational phosphorylation of KLF1 allows it to bind beta-globin promotor in erythroid cell lines that do not express beta-globin yet. To support this, the 15th gene set (not shown in figure above) related to positive regulation of tyrosine-peptidyl phosphorylation.
Also, a few of the top gene sets appear to be involved in immune cells and function.
The downregulated genes are enriched in regulation of protein kinase activity. We cannot conclude if this is allowing phosphorylation of KLF-1 since the geneset term is not specific enough in stating whether it positively or negatively regulates kinase activity.

The stats associated with neuron projection guidance are:
- ES = 0.4
- NES = 2.48
- FDR = 0.065
- 31 genes in leading edge
- Top gene: EFNB1; Receptor tyrosine kinase involved in migration, repulsion and adhesion during neuronal, vascular and epithelial development.

The stats associated with regulation of protein kinase activity are:
- ES = -0.2
- NES = -1.88
- FDR = 1.000
- 59 genes in leading edge
- Top gene: MAP3K6 in the MAP kinase signal transduction

Compared to the analysis in A2, the gene set results are more specific to neuronal and axonal growth, rather than general cell binding activity. I think these results can be compared because regardless of thresholded or nonthresholded, the scores observed among the genes prior to gene set enrichment analysis are very good, and therefore the thresholds do not affect the gene list much. With that being said, the thresholds are arbitrary while GSEA uses a non-thresholded approach, being fair by including all genes. So, although I say that they can be compared, I think the nonthresholded analysis would yield a more robust result.

Enrichment Map

For the following steps, Cytoscape 3.7.2 was used, along with its Apps.

Initial Enrichment Map production

The GSEA results were imported into Cytoscape using the Enrichment Map App. The FDR q-value cutoff was set to 0.1 (initially 0.01 but did not allow any to pass). Even then, this only allowed < 30 terms so instead a p-value cutoff of 0.05 was used. Below is the initial enrichment map.

knitr::include_graphics("initial_enrichment_map_final.png")

The enrichment map had 153 Nodes and 827 edges.
5 largest annotated clusters:
- ion homeostasis transport
- activation neutrophil immune
- cell morphogenesis differentiation
- inflammatory response oxygen
- regulation phosphorylation activity

AutoAnnotate App was used to annotate the EM, using Community Cluster and Word Cloud: Adjacent Words. The default MCL cluster did not work and other methods were not as informative as Community Cluster. Below is the annotated enrichment map.

knitr::include_graphics("annotated_enrichment_map.png")

The clusters were filtered down to only those containing >3 terms. Below is the filtered annotated enrichment map, with a legend.

knitr::include_graphics("enrichment_map_final_legends_added.png")

The clusters were collapsed to create a themed enrichment map. The major themes highlighted here are:
- neutrophil activation, inflammatory response in presence of oxygen, cell morphogenesis and differentiation (likely related to immune cells)
- regulation of phosphorylation
- ion transport and homeostasis
- cell adhesion
- matrisome
- apoptosis
These themes mildly fit with the model, where KLF-1 in macrophage is contributing to creating an erythroblastic island niche. Although the analysis is not suggesting anything as specific as the model, it does suggest how KLF-1 is involved in cell morphogenesis, likely of immune cells, since the cluster is connected to the inflammatory response cluster. Based on this EM, an interesting and novel pathway seen is neutrophil activation. As far as I have researched for papers, there were no direct KLF-1 activity associated with neutrophil activation.

Interpretation

  1. Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods?
    Partially. This result suggests how KLF-1 could be involved in cell morphogenesis of immune cells, but does not go as far to suggest which immune cells it plays a role in, aside from neutrophils (which is not the conclusion discussed in the paper). Based on the specific GO BP terms in the EM, the analysis more so supports that KLF-1 plays a role in cell morphogenesis of neuronal cells, instead of erythrocytes.
    Moreover, these results were much more specific compared to Assignment 2, where results were not specific for immune cells and more geared towards neurons. The rest were quite similar, for example, capturing terms associated with protein kinase activity and cell adhesion.

  2. Can you find evidence, i.e. publications, to support some of the results that you see. How does this evidence support your result?
    Yes. As stated previously in this report, Moore, Apara and Goldberg (2011) have concluded that the KLF family of transcription factors are involved in neurite outgrowth and axon regeneration, which are directly related to what we have observed in our enrichment map, under the cell morphogenesis theme. Furthermore, Bieker, Ouyang and Chen (1998) have shown that KLF-1, upon phosphorylation, allows transcription of beta-globin, in erythroid cell lines that have not yet expressed beta-globin. The enrichment map has a ‘regulation of phosphorylation’ theme, with many terms relating to positive regulation associated with KLF-1+ cells. Therefore, results from Bieker, Ouyang and Chen support the results seen here.

Post-Analysis

After enrichment analysis, my results seemed quite close to those that were achieved by the original publication. One thing that lacked in my results was specificity for erythrocytes, so I decided to explore this further to see if I could possibly get closer to the results from the original publication. Therefore, I selected the erythrocyte differentiation pathway signature set from MSigDB. By using this for post analysis, I figured that I could arrive at a more confident conclusion on whether my results are in line with the original publication.

The gene set I have selected is BIOCARTA_ERYTH_PATHWAY, from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/geneset_page.jsp?geneSetName=BIOCARTA_ERYTH_PATHWAY&keywords=erythrocyte).

I imported this into the main enrichment map that I generated and below is the post-analysis enrichment map. I’ve set the parameters to Mann-Whitney (One-sided Greater) because I want the erythrocyte differentiation genes to be expressed at the top end genes in my dataset. Unfortunately, the statistical values were not significant (although almost significant), so the cutoff was set to 0.15 instead of the ideal 0.05.

knitr::include_graphics("postanalysis_eryth.png")

This post analysis shows that indeed many of the genes expressed in KLF-1+ macrophages are related to erythrocyte differentiation, which supports the findings made by the original study. Therefore, although not confident due to insignificant statistical values, we can say that the genes are likely involved in inflammatory responses and cell morphogenesis for erythrocytic cell lines, and neutrophil activation could play an important role in differentiation of erythrocytes.
* The original network was subsetted to include only the nodes from clusters with >3 terms, in the post analysis.

References

Ouyang L, Chen X, Bieker JJ. Regulation of erythroid Krüppel-like factor (EKLF) transcriptional activity by phosphorylation of a protein kinase casein kinase II site within its interaction domain. J Biol Chem. 1998 Sep 4;273(36):23019-25.

Isserlin et al. Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results. F1000Res. 2014; 3:141.

Kucera et al. AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations. F1000Res. 2016 July 15;5:1717.

Liberzon et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011 June 15;27(12):1739-40.

Lopez-Yrigoyen M, Yang CT, Fidanza A, Cassetta L et al. Genetic programming of macrophages generates an in vitro model for the human erythroid island niche. Nat Commun 2019 Feb 20;10(1):881.

Montojo et al. GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop. Bioinformatics. 2010 Nov 15;26(22):2927-2928.

Moore DL, Apara A, and Goldberg JL. Krüppel-like transcription factors in the nervous system: novel players in neurite outgrowth and axon regeneration. Mol Cell Neurosci. 2011 Aug;47(4):233-43.

Morris et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics. 2011 November 9;12(436).

Oesper et al. WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code Biol Med. 2011 April 7;6:7.

Shannon et al. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003 Nov;13(11):2498-504.

Subramanian et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005 Oct 25;102(43):15545-50.

Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015). “limma powers differential expression analyses for RNA-sequencing and microarray studies.” Nucleic Acids Research, 43(7), e47. doi: 10.1093/nar/gkv007.

Robinson MD, McCarthy DJ, Smyth GK (2010). “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.” Bioinformatics, 26(1), 139-140. doi: 10.1093/bioinformatics/btp616.

Uku Raudvere, Liis Kolberg, Ivan Kuzmin, Tambet Arak, Priit Adler, Hedi Peterson, Jaak Vilo: g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update) Nucleic Acids Research 2019; doi:10.1093/nar/gkz369